Localization method and apparatus of displaying virtual object in augmented reality

ABSTRACT

Disclosed is a localization method and apparatus that may acquire localization information of a device, generate a first image that includes a directional characteristic corresponding to an object included in an input image, generate a second image in which the object is projected, based on the localization information, onto map data corresponding to a location of the object, and adjust the localization information based on visual alignment between the first image and the second image.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation of U.S. application Ser. No. 16/259,420, filed on Jan. 28, 2019, and claims the benefit under 35 USC § 119(a) of Korean Patent Application No. 10-2018-0108252, filed on Sep. 11, 2018, in the Korean Intellectual Property Office, the entire disclosures of which are all incorporated herein by reference for all purposes.

BACKGROUND

1. Field

The following description relates to a localization method and apparatus for displaying a virtual object in augmented reality (AR).

2. Description of Related Art

Various types of augmented reality (AR) services are provided in various fields, such as, for example, driving assistance for vehicles and other transportation devices, games, and amusements. Various localization methods may be used to provide AR realistically. For example, a sensor-based localization method may use various sensors, for example, a global positioning system (GPS) sensor and an inertial measurement unit (IMU) sensor, to verify a location and a direction of an object. When high accuracy is required, a sensor-based localization method requires a very expensive sensor with high accuracy, and thus, commercialization and miniaturization are difficult. Also, a vision-based localization method using camera information to acquire highly precise coordinate information may be difficult to use in an environment with many dynamic objects having continuous motions.

SUMMARY

This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used as an aid in determining the scope of the claimed subject matter.

In one general aspect, there is provided a localization method including acquiring localization information of a device, generating a first image including a directional characteristic corresponding to an object included in an input image, generating a second image in which the object is projected, based on the localization information, onto map data corresponding to a location of the object, and updating the localization information based on visual alignment between the first image and the second image.

The localization information may include a location of the apparatus and a pose of the apparatus.

The generating of the second image may include placing a virtual camera at the location on the map data, adjusting a pose of the virtual camera based on the pose of the apparatus, and generating an image of a viewpoint at which the object is viewed from the virtual camera.

The directional characteristic may correspond to a probability distribution indicating a degree of closeness to the object.

The input image may be based on an output of a first sensor, and the localization information may be based on an output of a second sensor.

The localization method may include determining a virtual object on the map data for an augmented reality (AR) service, and displaying the virtual object and the input image based on the adjusted localization information.

The virtual object may represent driving route information.

The generating of the first image may include generating a probability map that represents the directional characteristic using a trained neural network.

Each pixel in the probability map may be configured to store a distance from that pixel to a closest seed pixel.

The seed pixel may include a pixel corresponding to the object among pixels included in the input image.

The generating of the second image may include generating the second image using a transformer configured to transform a coordinate system of the map data to a coordinate system of the second image.

The localization information may include 6 degrees of freedom (6DoF).

The updating of the localization information may include calculating a degree of the visual alignment by matching the first image and the second image, and modifying the localization information to increase the degree of the visual alignment based on the directional characteristic.

The calculating may include adding up values of pixels corresponding to the object in the second image from among pixels in the first image.

The modifying of the localization information based on the directional characteristic may include modifying the localization information to transform the object in the second image based on the directional characteristic.

The modifying of the localization information based on the directional characteristic may include moving or rotating the object in the second image based on the directional characteristic.

The first image may be configured to classify the object based on an object type and to store a directional characteristic for each object type, and the second image may be configured to classify the object based on the object type and to store the projected object for each object type.

The modifying may include calculating a degree of visual alignment for each object type by matching the first image and the second image, and modifying the localization information to increase the degree of visual alignment based on the directional characteristic.

The input image may include a driving image of a vehicle.

The object may include any one or any combination of a line, a road surface marking, a traffic light, a sign, a curb stone, and a structure.

In another general aspect, there is provided a learning method including receiving a learning image, generating a reference image including a directional characteristic corresponding to an object in the learning image, based on map data for the learning image, generating an inference image that infers the directional characteristic corresponding to the object in the learning image, using a neural network, and training the neural network based on a difference between the reference image and the inference image.

The training may include training the neural network to minimize the difference between the reference image and the inference image.

The directional characteristic may correspond to a probability distribution indicating a degree of closeness to the object.

Each pixel in the reference image and the inference image may be configured to store a distance from that pixel to a closest seed pixel.

Each of the reference image and the inference image may be configured to classify the object based on a type of the object and to store the directional characteristic for each object type.

The training may include training the neural network based on a type difference between the reference image and the inference image.

The learning image may include a driving image of a vehicle.

The object may include any one or any combination of a line, a road surface marking, a traffic light, a sign, a curb stone, and a structure.

In another general aspect, there is provided a localization apparatus including sensors configured to acquire localization information of a device and an input image, and a processor configured to generate a first image including a directional characteristic corresponding to an object included in the input image, to generate a second image in which the object is projected, based on the localization information, onto map data corresponding to a location of the object, and to adjust the localization information based on visual alignment between the first image and the second image.

In another general aspect, there is provided a localization method including acquiring an input image corresponding to a location of a device, receiving map data corresponding to a location of an object, generating second images in which the object is projected, based on a plurality of pieces of respective candidate localization information, onto the map data, calculating a degree of visual alignment for each of the second images by matching the input image and each of the second images, selecting a second image having the greatest degree of visual alignment from the second images, and updating localization information based on candidate localization information corresponding to the selected second image.

The localization method may include generating a first image comprising a probability map indicating a directional characteristic corresponding to the object, wherein the calculating of the degree of visual alignment may include matching the first image and each of the second images.

In another general aspect, there is provided a localization apparatus including a first sensor configured to capture an image, a second sensor configured to acquire localization information of a device, a head-up display (HUD), and a processor configured to generate a first image including a directional characteristic corresponding to an object included in the image, generate a second image in which the object is projected, based on the localization information, onto map data corresponding to a location of the object, update the localization information based on visual alignment between the first image and the second image, and display the object and the input image on the map data, based on the adjusted localization information, on the HUD for an augmented reality (AR) service.

Other features and aspects will be apparent from the following detailed description, the drawings, and the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGS. 1A, 1B, and 1C illustrate examples of a visual alignment result corresponding to a localization error.

FIG. 2 is a diagram illustrating an example of a localization method.

FIG. 3 illustrates an example of a localization method.

FIG. 4 illustrates an example of a localization process.

FIG. 5 illustrates an example of a process of generating a first image.

FIG. 6 illustrates an example of a method of modifying localization information.

FIGS. 7A and 7B illustrate examples of a method of modifying localization information.

FIG. 8 is a diagram illustrating an example of a learning method.

FIG. 9 illustrates an example of a learning process.

FIG. 10 illustrates an example of images for learning.

FIG. 11 illustrates an example of a learning method.

FIG. 12 illustrates an example of a localization updating process of FIG. 11.

FIG. 13 is a diagram illustrating an example of a localization method.

FIG. 14 is a diagram illustrating an example of a localization apparatus.

Throughout the drawings and the detailed description, unless otherwise described or provided, the same drawing reference numerals will be understood to refer to the same elements, features, and structures. The drawings may not be to scale, and the relative size, proportions, and depiction of elements in the drawings may be exaggerated for clarity, illustration, and convenience.

DETAILED DESCRIPTION

The following detailed description is provided to assist the reader in gaining a comprehensive understanding of the methods, apparatuses, and/or systems described herein. However, various changes, modifications, and equivalents of the methods, apparatuses, and/or systems described herein will be apparent after an understanding of the disclosure of this application. For example, the sequences of operations described herein are merely examples, and are not limited to those set forth herein, but may be changed as will be apparent after an understanding of the disclosure of this application, with the exception of operations necessarily occurring in a certain order. Also, descriptions of features that are known in the art may be omitted for increased clarity and conciseness.

The features described herein may be embodied in different forms and are not to be construed as being limited to the examples described herein. Rather, the examples described herein have been provided merely to illustrate some of the many possible ways of implementing the methods, apparatuses, and/or systems described herein that will be apparent after an understanding of the disclosure of this application.

Although terms such as “first,” “second,” “third,” “A,” “B,” (a), and (b) may be used herein to describe various members, components, regions, layers, or sections, these members, components, regions, layers, or sections are not to be limited by these terms. Rather, these terms are only used to distinguish one member, component, region, layer, or section from another member, component, region, layer, or section. Thus, a first member, component, region, layer, or section referred to in examples described herein may also be referred to as a second member, component, region, layer, or section without departing from the teachings of the examples.

If the specification states that one component is “connected,” “coupled,” or “joined” to a second component, the first component may be directly “connected,” “coupled,” or “joined” to the second component, or a third component may be “connected,” “coupled,” or “joined” between the first component and the second component. However, if the specification states that a first component is “directly connected” or “directly joined” to a second component, a third component may not be “connected” or “joined” between the first component and the second component. Similar expressions, for example, “between” and “immediately between” and “adjacent to” and “immediately adjacent to,” are also to be construed in this manner.

The terminology used herein is for the purpose of describing particular examples only, and is not intended to limit the disclosure or claims. The singular forms “a,” “an,” and “the” include the plural forms as well, unless the context clearly indicates otherwise. The terms “comprises,” “comprising,” “includes,” and “including” specify the presence of stated features, numbers, operations, elements, components, or combinations thereof, but do not preclude the presence or addition of one or more other features, numbers, operations, elements, components, or combinations thereof.

The use of the term ‘may’ herein with respect to an example or embodiment, e.g., as to what an example or embodiment may include or implement, means that at least one example or embodiment exists where such a feature is included or implemented, while all examples and embodiments are not limited thereto.

Hereinafter, the example embodiments are described with reference to the accompanying drawings. Like reference numerals used herein may refer to like elements throughout.

FIGS. 1A, 1B, and 1C illustrate examples of a visual alignment result corresponding to a localization error.

Augmented reality (AR) refers to adding or augmenting information based on reality to images and providing the added or augmented information. For example, the AR may provide an image in which a virtual object corresponding to a virtual image is added to an image or a background image of a real world. Since the real world and the virtual world are harmonized in the AR, a user may experience a sense of immersion that enables real-time interaction between the user and the virtual world, without recognizing that a virtual environment is distinct from a real environment. To match the virtual object with a real image, a location and a pose, i.e., localization information, of a user device or the user to which the AR is to be provided should be verified.

The localization information is used to locate a virtual object at a desired location in an image. In AR, a degree of visual alignment when projecting the virtual object onto a two-dimensional (2D) image is more important than an error occurring in an actual three-dimensional (3D) space or an error occurring in feature matching. The degree of visual alignment corresponds to, for example, an overlapping ratio or a matching ratio between the virtual object and the real image. The degree of visual alignment varies based on a localization error as shown in FIGS. 1A and 1B. Hereinafter, an example of displaying a driving guide lane corresponding to a virtual object on a road surface is described as an example.

In the following description, the examples described herein may be used to generate information to support a driver or to control an autonomous vehicle. The examples described herein may also be used to interpret visual information in a device, such as, for example, an intelligent system installed for fully autonomous driving or driving assistance in a vehicle, and used to assist safe and comfortable driving. The examples described herein may be applicable to vehicles and vehicle management systems such as, for example, an autonomous vehicle, an automatic or autonomous driving system, an intelligent vehicle, an advanced driver assistance system (ADAS), a navigation system to assist a vehicle with safely maintaining a lane on which the vehicle is travelling, a smartphone, or a mobile device. The examples related to displaying a driving guide lane corresponding to a virtual object are provided as examples only, and other examples such as, for example, training, gaming, applications in healthcare, public safety, tourism, and marketing are considered to be well within the scope of the present disclosure.

Referring to FIG. 1A, an AR image 120 is based on a visual alignment result when a localization error is small. Referring to FIG. 1B, an AR image 140 is based on a visual alignment result when the localization error is large.

For example, a reference path of a vehicle is displayed on a road image based on localization information of an object 110. In an example, the object 110 corresponds to a user terminal and/or the vehicle that performs localization. When a localization error of the object 110 is small, a driving guide lane 115, i.e., a virtual object to be displayed, may be visually well aligned with an actual road image as shown in the AR image 120. When a localization error of an object 130 is large, a driving guide lane 135 that is a virtual object to be displayed may not be visually aligned with an actual road image as shown in the AR image 140.

In an example, an accurate AR service may be provided by optimizing localization information to increase a degree of visual alignment when projecting a virtual object onto a two-dimensional (2D) image.

Referring to FIG. 1C, localization information includes a location and a pose of an apparatus. The location corresponds to 3D coordinates (x, y, z). Here, the x coordinate denotes a lateral location t_(x), the y coordinate denotes a vertical location t_(y), and the z coordinate denotes a longitudinal location t_(z). Also, the pose corresponds to pitch r_(x), yaw r_(y), and roll r_(z). For example, the location is acquired using, for example, a global positioning system (GPS) sensor and a lidar, and the pose is acquired using, for example, an inertial measurement unit (IMU) sensor and a gyro sensor. The localization information may be understood to include 6 degrees of freedom (6DoF) that includes the location and the pose.
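
For illustration only, the 6DoF localization information described above could be held in a small container such as the following Python sketch. The field names mirror the t_(x), t_(y), t_(z), r_(x), r_(y), r_(z) notation of FIG. 1C; the container itself is an assumption for exposition and not part of the disclosed apparatus.

```python
from dataclasses import dataclass

@dataclass
class Localization6DoF:
    """Illustrative 6DoF localization: a 3D location plus a pose."""
    t_x: float  # lateral location
    t_y: float  # vertical location
    t_z: float  # longitudinal location
    r_x: float  # pitch, in radians
    r_y: float  # yaw, in radians
    r_z: float  # roll, in radians
```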

In an example, the vehicle described herein refers to any mode of transportation, delivery, or communication such as, for example, an automobile, a truck, a tractor, a scooter, a motorcycle, a cycle, an amphibious vehicle, a snowmobile, a boat, a public transit vehicle, a bus, a monorail, a train, a tram, an autonomous or automated driving vehicle, an intelligent vehicle, a self-driving vehicle, an unmanned aerial vehicle, an electric vehicle (EV), a hybrid vehicle, a smart mobility device, an ADAS, or a drone. In an example, the smart mobility device includes mobility devices such as, for example, electric wheels, an electric kickboard, and an electric bike. In an example, vehicles include motorized and non-motorized vehicles, for example, a vehicle with a power engine (for example, a cultivator or a motorcycle), a bicycle, or a handcart.

The term “road” is a thoroughfare, route, or connection between two places that has been improved to allow travel by foot or some form of conveyance, such as a vehicle. A road can include various types of roads such as, for example, highways, national roads, farm roads, local roads, or high-speed national roads. A road may include a single lane or a plurality of lanes. Lanes correspond to road spaces that are distinguished from each other by road lines marked on a surface of a road. In an example, a “lane” is a space of a plane on which a vehicle is traveling among a plurality of lanes, i.e., a space occupied and used by the vehicle. One lane is distinguished from the other lanes by right and left markings of the lane.

Also, the term “line” may be understood as various types of lines, for example, a solid line, a dotted line, a curved line, and a zigzagged line, which are marked in colors such as white, blue, or yellow on the road surface. The line may correspond to a line on one side that distinguishes a single lane, and may also be a pair of lines, that is, a left line and a right line corresponding to a lane boundary line that distinguishes a single lane, a center line of a road, and a stop line. In addition, a line may indicate an area prohibited for parking and stopping, a crosswalk, a towaway zone, and an indication of a speed limit.

The following examples may be applied to an AR navigation device in a smart vehicle, for example. In an example, the AR navigation device is used to mark a line, to generate visual information to help steering of an autonomous driving vehicle, or to provide a variety of control information for driving of a vehicle. The AR navigation device may provide visual information to a display. In an example, the display is a head-up display (HUD), a vehicular infotainment system, a dashboard in a vehicle, or a screen in the vehicle that uses augmented reality. The display is installed for driving assistance or complete autonomous driving of a vehicle and to assist safe and pleasant driving. In an example, the display may also be implemented as an eye glass display (EGD), which includes one-eyed glass or two-eyed glasses.

FIG. 2 is a diagram illustrating an example of a localization method. The operations in FIG. 2 may be performed in the sequence and manner as shown, although the order of some operations may be changed or some of the operations omitted without departing from the spirit and scope of the illustrative examples described. Many of the operations shown in FIG. 2 may be performed in parallel or concurrently. One or more blocks of FIG. 2, and combinations of the blocks, can be implemented by special purpose hardware-based computers that perform the specified functions, or combinations of special purpose hardware and computer instructions. In addition to the description of FIG. 2 below, the descriptions of FIGS. 1A-1C are also applicable to FIG. 2, and are incorporated herein by reference. Thus, the above description may not be repeated here.

Referring to FIG. 2, in operation 210, a localization apparatus acquires localization information of a corresponding apparatus. Here, the apparatus refers to an apparatus that performs the localization method, such as, for example, a vehicle, a navigation device, or a user device such as a smartphone. The localization information may have 6DoF that includes a location and a pose of the apparatus. The localization information is acquired based on an output of a sensor, such as, for example, an IMU sensor, a GPS sensor, a lidar sensor, or a radar. The localization information may be, for example, initial localization information of the localization apparatus.

In operation 220, the localization apparatus generates a first image that includes a directional characteristic corresponding to an object included in an input image. The input image may correspond to a background image or another image that is displayed with a virtual object for an AR service. For example, the input image includes a driving image of the vehicle. The driving image may be captured from, for example, a photographing device mounted to the vehicle. In an example, the driving image includes a plurality of frames. The localization apparatus acquires the input image based on an output of the photographing device. The photographing device is fastened at a location, such as, for example, a windshield, a dashboard, a front fender, or a rear-view mirror, to capture an image ahead of the vehicle. The photographing device may include, for example, a vision sensor, an image sensor, or a device that performs a similar function. The photographing device may capture a single image or may capture an image per frame if needed. In another example, the driving image may be captured from another apparatus, aside from the localization apparatus. The driving image may be an input image 410 of FIG. 4. In one example, objects include a line, a road surface marking, a traffic light, a sign, a curb stone, a pedestrian, other vehicles, and a structure.

In operation 220, the localization apparatus generates a probability map, for example, a distance field map, indicating the directional characteristic using a pretrained neural network. In an example, the “directional characteristic corresponding to an object” may correspond to a probability distribution indicating a degree of closeness to the object. Here, each of the pixels included in the probability map stores a distance from the corresponding pixel to a closest seed pixel. The seed pixel may be a pixel corresponding to the object among pixels included in an image. The first image may be, for example, a distance field map 550 of FIG. 5. A method of generating the probability map using the localization apparatus will be further described with reference to FIG. 5.
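
As one way to picture such a distance field map, the brute-force sketch below computes, for every pixel, the distance to the nearest seed pixel and converts it into a closeness value near 1 at the object and near 0 far away. In the method described here the map is produced by a trained neural network, so this helper, including the decay constant, is only a conceptual stand-in and not the disclosed implementation.

```python
import numpy as np

def distance_field_map(seed_mask: np.ndarray, decay: float = 0.02) -> np.ndarray:
    """Toy distance field. seed_mask is a boolean H x W array whose True pixels
    correspond to the object (seed pixels). Each output pixel stores a closeness
    value that decays with the distance to its nearest seed pixel."""
    h, w = seed_mask.shape
    ys, xs = np.nonzero(seed_mask)
    seeds = np.stack([ys, xs], axis=1).astype(np.float32)   # (N, 2) seed coordinates
    out = np.zeros((h, w), dtype=np.float32)
    if len(seeds) == 0:
        return out
    for y in range(h):
        for x in range(w):
            d = np.sqrt(((seeds - (y, x)) ** 2).sum(axis=1)).min()
            out[y, x] = np.exp(-decay * d)    # ~1 near the object, ~0 far from it
    return out
```

This O(H x W x N) loop is intentionally simple; a practical implementation would use a distance transform, but the stored quantity per pixel is the same.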

In operation 230, the localization apparatus generates a second image in which the object is projected according to the localization information, based on map data including a location of the object. In an example, the map data is high-density (HD) map data. An HD map refers to a three-dimensional (3D) map having high density, for example, centimeter-based density, for autonomous driving. Line-based information, for example, a center line of a road and a boundary line, and information, for example, a traffic light, a sign, a curb stone, a road surface marking, and various structures, may be included in the HD map in a 3D digital format. The HD map may be built using, for example, a mobile mapping system (MMS). The MMS refers to a 3D spatial information investigation system including various sensors, and may include a moving object with a sensor, such as, for example, a camera, a lidar, and a GPS, for measurement of a location and geographical features. Sensors of the MMS interact with each other flexibly and acquire various and precise location information.

In operation 230, generating the second image in which the object is projected according to the localization information based on the map data means, for example, that the localization apparatus places a virtual camera at a location included in the localization information on the map data, adjusts a pose of the virtual camera based on a pose included in the localization information, and generates an image of a viewpoint at which the object is viewed from the virtual camera.
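
The virtual-camera step can be pictured as an ordinary pinhole projection of 3D map points expressed in the camera frame. In the sketch below, the rotation matrix R (world-to-camera, built from the pose) and the intrinsic matrix K are assumptions introduced for illustration; the patent itself only specifies a virtual camera placed and oriented according to the localization information.

```python
import numpy as np

def project_map_points(points_world: np.ndarray, t: np.ndarray,
                       R: np.ndarray, K: np.ndarray) -> np.ndarray:
    """Project N x 3 map points (world frame) into second-image pixel coordinates.
    t: (3,) camera position from the localization; R: 3x3 world-to-camera rotation
    derived from the pose (pitch, yaw, roll); K: 3x3 camera intrinsic matrix."""
    pts_cam = (np.asarray(points_world) - t) @ R.T   # move points into the camera frame
    pts_cam = pts_cam[pts_cam[:, 2] > 0]             # keep points in front of the camera
    uv = (K @ pts_cam.T).T                           # pinhole projection
    return uv[:, :2] / uv[:, 2:3]                    # normalize by depth -> pixel coords
```

Rasterizing the returned pixel coordinates of an object (for example, a lane line from the HD map) yields the projected second image for the current localization.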

In operation 230, the localization apparatus generates the second image using, for example, a transformer that transforms a coordinate system of the map data to a coordinate system of the second image. Here, the transformer may be, for example, a homographic function representing a transformation relationship between corresponding points when projecting one plane onto another plane, or an artificial neural network that performs the transformation. The localization apparatus extracts partial data from the map data based on the localization information and generates the second image from the extracted partial data using the transformer. The second image may be, for example, a second image 430 of FIG. 4.

In operation 240, the localization apparatus modifies or adjusts the localization information based on visual alignment between the first image and the second image. The localization apparatus calculates a degree of the visual alignment by matching the first image and the second image. For example, the localization apparatus adds up values of pixels corresponding to an object included in the second image among a plurality of pixels included in the first image and determines a result of the addition as the degree of visual alignment. The degree of visual alignment may be represented in, for example, a gradient descent form. In an example, the localization apparatus modifies the localization information to increase the degree of visual alignment based on the directional characteristic corresponding to the object. The localization apparatus modifies the localization information to transform, for example, move or rotate, the object included in the second image based on the directional characteristic. A method of modifying, by the localization apparatus, the localization information will be further described with reference to FIG. 6.
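
The alignment score described in operation 240 is simply the sum of first-image values sampled at the pixels where the second image contains the projected object. A minimal sketch, assuming both images are arrays of the same size (the array layout and names are illustrative assumptions):

```python
import numpy as np

def alignment_score(first_image: np.ndarray, second_image: np.ndarray) -> float:
    """first_image: H x W distance-field/probability map with values in [0, 1].
    second_image: H x W map in which the projected object pixels are nonzero.
    The score adds up first-image values at the projected-object pixels."""
    object_pixels = second_image > 0
    return float(first_image[object_pixels].sum())
```

Because the first image spreads values smoothly away from the object, this score grows as the projected object slides toward the true object, which is what makes gradient-style updates of the localization possible.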

The localization apparatus determines the virtual object on the map data for the AR service. For example, the virtual object may represent driving route information using an arrow indicator or a road marking indicating a direction of progress. The localization apparatus may display the virtual object and the input image on, for example, a head-up display (HUD), a navigation system, or a display of a user device based on the localization information modified in operation 240.

FIG. 3 illustrates an example of a localization method, and FIG. 4 illustrates an example of a localization process. The operations in FIGS. 3-4 may be performed in the sequence and manner as shown, although the order of some operations may be changed or some of the operations omitted without departing from the spirit and scope of the illustrative examples described. Many of the operations shown in FIGS. 3-4 may be performed in parallel or concurrently. One or more blocks of FIGS. 3-4, and combinations of the blocks, can be implemented by special purpose hardware-based computers that perform the specified functions, or combinations of special purpose hardware and computer instructions. In addition to the description of FIGS. 3-4 below, the descriptions of FIGS. 1-2 are also applicable to FIGS. 3-4, and are incorporated herein by reference. Thus, the above description may not be repeated here.

Referring to FIGS. 3 and 4, in operation 310, the localization apparatus acquires the input image 410. For example, the localization apparatus receives the input image 410 generated by capturing an object using an image sensor. Here, the input image 410 may be an image corresponding to a current location of an apparatus.

In operation 320, the localization apparatus receives or acquires map data that includes a location of the object.

In operation 330, the localization apparatus estimates the object from the input image 410 acquired in operation 310. The localization apparatus may generate a first image 420 that includes a directional characteristic corresponding to the object based on the input image 410. In an example, the localization apparatus generates the first image 420 using a pretrained neural network. The localization apparatus may use the pretrained neural network to stably estimate the object regardless of various obstacles, such as, for example, a vehicle, a pedestrian, and a street tree, in the input image 410. For example, when the input image 410 is applied, the pretrained neural network generates the first image 420 by activating a portion corresponding to a line in the applied input image 410. The first image 420 may include, for example, a 2D distance field map.

In operation 340, the localization apparatus performs initial localization. In operation 350, the localization apparatus generates the second image 430 in which the object is projected based on the initial localization, by applying localization information corresponding to the initial localization to the map data that is acquired in operation 320.

The localization apparatus performs visual alignment on the first image 420 and the second image 430. The localization apparatus visually aligns the first image 420 and the second image 430, as shown in an image 440.

In operation 360, the localization apparatus optimizes the visual alignment. The localization apparatus calculates a localization modification value so that the first image 420 and the second image 430 may maximally overlap through the visual alignment. The localization apparatus may optimize the visual alignment by changing the localization information to maximize overlapping between the first image 420 and the second image 430 based on the initial localization performed in operation 340. In one example, gradient-based optimization may be readily performed using the first image 420 in which information is spread over the entire image, i.e., the distance field map.

In operation 370, the localization apparatus applies the localization modification value to the localization information and updates the localization information to optimize the visual alignment between the first image 420 and the second image 430, as shown in an image 450.

FIG. 5 illustrates an example of a process of generating a first image. A process of applying an input image 510 to a neural network 530 and generating a distance field map 550 corresponding to a first image will be described with reference to FIG. 5.

The neural network 530 refers to a neural network that is pretrained to generate a first image including a directional characteristic corresponding to an object included in the input image 510, based on the input image 510. The neural network 530 may be a deep neural network (DNN) or an n-layer neural network. The DNN or n-layer neural network may correspond to a convolutional neural network (CNN), a recurrent neural network (RNN), a deep belief network, a fully connected network, a bi-directional neural network, a restricted Boltzmann machine, or a bidirectional long short-term memory (BLSTM), or may include different or overlapping neural network portions respectively with full, convolutional, recurrent, and/or bi-directional connections. For example, the neural network 530 may be embodied as a CNN, but is not limited thereto.

The neural network 530 may be embodied as an architecture having a plurality of layers including an input image, feature maps, and an output. In the neural network 530, a convolution operation is performed on the input image with a filter referred to as a kernel, and as a result, the feature maps are output. The convolution operation is performed again on the output feature maps as input feature maps, with a kernel, and new feature maps are output. When the convolution operation is repeatedly performed as such, a recognition result with respect to features of the input image may be finally output through the neural network 530. In an example, in addition to the convolution layers, the neural network 530 may include a pooling layer or a fully connected layer.
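
A minimal PyTorch-style sketch of such a fully convolutional network is shown below. The layer sizes and the final sigmoid are arbitrary assumptions; the block only illustrates the image-in, distance-field-out structure attributed to the neural network 530, not its actual architecture.

```python
import torch.nn as nn

class DistanceFieldNet(nn.Module):
    """Toy fully convolutional network: RGB input image -> single-channel map
    whose values (after the sigmoid) lie in [0, 1], like a distance field."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=3, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):          # x: (batch, 3, H, W)
        return self.net(x)         # (batch, 1, H, W) distance-field-like output
```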

The neural network 530 estimates the object included in the input image 510 in the form of the distance field map 550. For example, when the first image includes directional characteristic information associated with a close object as in the distance field map 550, a directional characteristic for optimization may be readily determined using a gradient descent scheme. When the probability distribution indicating a degree of closeness to the object is distributed over the overall image as in the distance field map 550, an amount of data available for learning is increased. Performance of the neural network 530 may thus be enhanced compared to learning with sparse data.

FIG. 6 illustrates an example of a method of modifying localization information. FIG. 6 illustrates an input image 605, a first image 610, and a second image 620. The first image 610 is generated based on the input image 605. The second image 620 is generated by projecting an object, based on map data, according to localization information (x, y, z, r_(x), r_(y), r_(z)) corresponding to an initial localization.

The localization apparatus matches the first image 610 and the second image 620 into an image 630 and calculates a degree of visual alignment therebetween in the form of, for example, a score. The localization apparatus adds up values of pixels corresponding to an object included in the second image 620 from among a plurality of pixels included in the first image 610 and calculates the result of the addition in the form of a score.

For example, each of the plurality of pixels included in the first image 610 may have a value between 0 and 1 based on a distance from an object adjacent to the respective pixel. Each pixel may have a value close to 1 as the distance from the adjacent object decreases and may have a value close to 0 as the distance from the adjacent object increases. The localization apparatus extracts pixels that match the second image 620 from among the plurality of pixels included in the first image 610, adds up values of the extracted pixels, and calculates a score.

The localization apparatus modifies the localization information to increase the degree of visual alignment, i.e., the score, based on the directional characteristic of the first image 610. The localization apparatus calculates a localization modification value so that the localization of the object included in the second image 620 fits the directional characteristic of the first image 610. The localization apparatus updates the localization information from the initial localization to updated localization information 640 by applying the localization modification value to the localization information corresponding to the initial localization. For example, the localization apparatus determines a direction in which the object of the second image 620 is to be moved to increase the score, based on the directional characteristic included in the first image 610. Once the localization information is updated, the object of the second image 620 is moved accordingly. Thus, the localization apparatus updates the localization information based on the directional characteristic included in the first image 610.

The localization apparatus generates an updated second image 650 based on the updated localization information 640. The localization apparatus calculates an updated score by matching the updated second image 650 and the first image 610.

The localization apparatus calculates a localization modification value that maximizes the score by repeating the aforementioned process, and outputs optimized localization information.
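
The iterative update of FIG. 6 can be pictured as a search over the 6DoF parameters that keeps whichever change raises the score. The apparatus described above uses a gradient-style update driven by the directional characteristic; the random-perturbation loop below is only a stand-in for that optimization, built on the hypothetical render_second_image helper and the alignment_score sketch shown earlier.

```python
import numpy as np

def refine_localization(first_image, map_data, theta, render_second_image,
                        alignment_score, iterations=100, step=0.05):
    """theta: 6-vector (x, y, z, r_x, r_y, r_z). render_second_image(map_data, theta)
    is assumed to return the projected second image for a candidate localization."""
    best_theta = np.asarray(theta, dtype=np.float64)
    best_score = alignment_score(first_image, render_second_image(map_data, best_theta))
    for _ in range(iterations):
        candidate = best_theta + np.random.uniform(-step, step, size=6)
        score = alignment_score(first_image, render_second_image(map_data, candidate))
        if score > best_score:            # keep updates that improve visual alignment
            best_theta, best_score = candidate, score
    return best_theta, best_score
```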

FIGS. 7A and 7B illustrate examples of a method of modifying localization information.

FIG. 7A illustrates an example in which objects included in an input image are not classified by type. For example, when a line 710 and a line 730 correspond to objects included in a first image and a line 720 is included in a second image, the localization apparatus modifies localization information to increase a degree of visual alignment between the first image and the second image calculated by matching the first image and the second image. In one example, when each object type is not distinguished as illustrated in FIG. 7A, the localization apparatus may not accurately verify whether to match the line 720 and the line 710 or whether to match the line 720 and the line 730, which may make it difficult to accurately modify the localization information.

FIG. 7B illustrates an example in which objects in an input image are classified according to a type of the object. The first image may classify the object based on a type of the object and may store a directional characteristic for each type of object. Also, the second image may classify the object based on a type of the object and may store a projected object for each object type. For example, a line 740 and a line 760 may correspond to objects included in the first image. The line 740 may correspond to a first type (Type 1) and the line 760 may correspond to a second type (Type 2). Also, a line 750 may be included in the second image and correspond to the first type (Type 1).

Referring to FIG. 7B, when each of the objects is classified by type, the localization apparatus matches the first image and the second image and calculates a degree of visual alignment for each object type. The localization apparatus modifies the localization information to increase the degree of visual alignment for each type based on the directional characteristic.

For example, the localization apparatus may calculate a degree of visual alignment for the objects corresponding to the first type (Type 1), the lines 740 and 750, and may modify the localization information to increase the degree of visual alignment corresponding to the first type based on the directional characteristic. In this way, the localization apparatus may modify the localization information to match the objects that are the lines 740 and 750.
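
When objects are classified per type as in FIG. 7B, the score can simply be accumulated per type, so that a Type-1 line in the second image is only compared against the Type-1 channel of the first image. The dictionary-of-channels layout below is an assumption introduced for illustration, not the disclosed data structure.

```python
import numpy as np

def per_type_alignment_score(first_by_type: dict, second_by_type: dict) -> dict:
    """first_by_type / second_by_type map an object type (e.g. "line_type_1") to an
    H x W array: a distance-field channel and a projected-object mask, respectively."""
    scores = {}
    for obj_type, second in second_by_type.items():
        first = first_by_type.get(obj_type)
        if first is None:
            continue
        scores[obj_type] = float(first[second > 0].sum())  # match only within the same type
    return scores
```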

FIG. 8 is a diagram illustrating an example of a learning method, and FIG. 9 illustrates an example of a learning process. The operations in FIG. 8 may be performed in the sequence and manner as shown, although the order of some operations may be changed or some of the operations omitted without departing from the spirit and scope of the illustrative examples described. Many of the operations shown in FIG. 8 may be performed in parallel or concurrently. One or more blocks of FIG. 8, and combinations of the blocks, can be implemented by special purpose hardware-based computers that perform the specified functions, or combinations of special purpose hardware and computer instructions. Training of a neural network, by a learning apparatus, to generate a first image from an input image to optimize localization information for AR will be described with reference to FIGS. 8, 9, and 10. In addition to the description of FIG. 8 below, the descriptions of FIGS. 1-7B are also applicable to FIG. 8, and are incorporated herein by reference. Thus, the above description may not be repeated here.

Referring to FIGS. 8 and 9, in operation 810, the learning apparatus receives a learning image 910. The learning image 910 may include, for example, a driving image of a vehicle. The learning image 910 may be, for example, a learning image 1010 of FIG. 10.

In operation 820, the learning apparatus generates a reference image 950 that includes a directional characteristic corresponding to an object included in the learning image 910, based on map data 940 for the learning image 910. In an example, the directional characteristic corresponds to a probability distribution indicating a degree of closeness to an object. The reference image 950 may correspond to a ground truth (GT) image. The reference image 950 may be, for example, a reference image 1030 or a reference image 1040 of FIG. 10.

In operation 830, the learning apparatus generates an inference image 930 that infers the directional characteristic corresponding to the object included in the learning image 910, using a neural network 920. The neural network 920 may be, for example, the neural network 530 of FIG. 5. A method of generating, by the learning apparatus, the inference image 930 will be further described with reference to FIG. 10.

In operation 840, the learning apparatus trains the neural network 920 based on a difference, for example, a loss 960, between the reference image 950 and the inference image 930. The learning apparatus may train the neural network 920 to minimize the difference between the reference image 950 and the inference image 930. The learning apparatus may train the neural network 920 through, for example, supervised learning. The learning apparatus may update the neural network 920 through a gradient descent scheme based on the loss 960 that is back-propagated to the neural network 920 through back-propagation learning and output values of nodes included in the neural network 920. Back-propagation learning refers to a method of estimating the loss 960 by performing forward computation on the reference image 950 and updating the neural network 920 to reduce the loss 960 while propagating the estimated loss 960 starting from an output layer of the neural network 920 toward a hidden layer and an input layer.
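
Supervised training on the difference between the reference image and the inference image could look like the following PyTorch-style sketch. The mean-squared-error loss and the choice of optimizer are assumptions; the description above only specifies minimizing the difference (loss 960) via back-propagation and gradient descent.

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, learning_image, reference_image):
    """learning_image: (batch, 3, H, W); reference_image: (batch, 1, H, W) distance field."""
    optimizer.zero_grad()
    inference_image = model(learning_image)               # operation 830: infer the distance field
    loss = F.mse_loss(inference_image, reference_image)   # operation 840: difference as a loss
    loss.backward()                                       # back-propagate the loss
    optimizer.step()                                      # update the network to reduce the loss
    return loss.item()
```

For example, with the toy DistanceFieldNet sketched earlier, one could pass optimizer = torch.optim.Adam(model.parameters(), lr=1e-3) and call train_step once per mini-batch.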

FIG. 10 illustrates an example of images for learning. FIG. 10 illustrates the learning image 1010, a map data image 1020, and the reference images 1030 and 1040.

Referring to FIG. 10, the learning apparatus trains a neural network to estimate the reference images 1030 and 1040 from the learning image 1010. In an example, the map data image 1020 represents objects in the learning image 1010 using a discrete binary value. Thus, when using the map data image 1020, learning information may be too sparse to smoothly perform learning. In an example, learning may be performed using a distance field map, such as the reference images 1030 and 1040. Sparse learning information may be spread across the overall image through the distance field map. When learning information is present over the entire target image as in the distance field map, it is possible to train the neural network based on sufficient learning information.

In an example, the learning apparatus generates the reference image 1030 or the reference image 1040 from the map data image 1020 by adjusting an importance of spread information in a distance field. For example, the learning apparatus may adjust the importance of spread information in the distance field, for example, as e^(−0.02d). Here, d denotes a distance between a seed pixel corresponding to an object and a corresponding pixel.
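
The e^(−0.02d) spread adjustment can be applied directly to a per-pixel distance map; a sketch, assuming the distances to the closest seed pixels have already been computed (for example, with the brute-force helper shown earlier), and treating the weight 0.02 as a tunable constant:

```python
import numpy as np

def reference_from_distances(distance_to_seed: np.ndarray, weight: float = 0.02) -> np.ndarray:
    """distance_to_seed: H x W array holding, for each pixel, the distance d to its
    closest seed pixel. Returns a reference image whose importance spreads as e^(-weight * d)."""
    return np.exp(-weight * distance_to_seed)
```

A larger weight concentrates the learning information near the object; a smaller weight spreads it further across the image.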

FIG. 11 illustrates an example of a learning method, and FIG. 12 illustrates an example of a localization updating process of FIG. 11. The operations in FIG. 11 may be performed in the sequence and manner as shown, although the order of some operations may be changed or some of the operations omitted without departing from the spirit and scope of the illustrative examples described. Many of the operations shown in FIG. 11 may be performed in parallel or concurrently. One or more blocks of FIG. 11, and combinations of the blocks, can be implemented by special purpose hardware-based computers that perform the specified functions, or combinations of special purpose hardware and computer instructions. In addition to the description of FIG. 11 below, the descriptions of FIGS. 1-10 are also applicable to FIG. 11, and are incorporated herein by reference. Thus, the above description may not be repeated here.

Referring to FIGS. 11 and 12, in operation 1110, the localization apparatus acquires an input image. In an example, the input image refers to an image corresponding to a current location of a corresponding apparatus.

In operation 1120, the localization apparatus receives or acquires map data that includes a location of an object. In operation 1140, the localization apparatus applies a plurality of pieces of candidate localization information to the map data acquired in operation 1120. In operation 1150, the localization apparatus generates second candidate images, in each of which an object is projected based on the corresponding candidate localization information. For example, the localization apparatus may generate second candidate images, for example, a first candidate image (candidate 1) 1210 and a second candidate image (candidate 2) 1220 of FIG. 12, each to which the candidate localization information is applied.

In operation 1130, the localization apparatus scores visual alignment between the input image and each of the second candidate images. For example, a degree of visual alignment between the input image and the first candidate image 1210 is scored as 0.43, and a degree of visual alignment between the input image and the second candidate image 1220 is scored as 0.98.

In operation 1160, the localization apparatus searches for a best score having a highest value from among the scores output in operation 1130. Referring to FIG. 12, the localization apparatus retrieves 0.98 as the best score from among the scores 0.43 and 0.98. In operation 1170, the localization apparatus updates the localization information by selecting a candidate localization corresponding to the best score retrieved in operation 1160.
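
The candidate-based update of FIGS. 11 and 12 amounts to scoring every candidate second image against the input (or first) image and keeping the best one. A sketch under the same assumptions as the earlier helpers (render_second_image and alignment_score are hypothetical names, not the disclosed functions):

```python
def update_from_candidates(first_image, map_data, candidates,
                           render_second_image, alignment_score):
    """candidates: iterable of candidate localization vectors. Returns the candidate whose
    projected second image best aligns with the first image (operations 1130-1170)."""
    best_theta, best_score = None, float("-inf")
    for theta in candidates:
        score = alignment_score(first_image, render_second_image(map_data, theta))
        if score > best_score:
            best_theta, best_score = theta, score
    return best_theta, best_score
```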

FIG. 13 is a diagram illustrating an example of a localization method. The operations in FIG. 13 may be performed in the sequence and manner as shown, although the order of some operations may be changed or some of the operations omitted without departing from the spirit and scope of the illustrative examples described. Many of the operations shown in FIG. 13 may be performed in parallel or concurrently. One or more blocks of FIG. 13, and combinations of the blocks, can be implemented by special purpose hardware-based computers that perform the specified functions, or combinations of special purpose hardware and computer instructions. In addition to the description of FIG. 13 below, the descriptions of FIGS. 1-12 are also applicable to FIG. 13, and are incorporated herein by reference. Thus, the above description may not be repeated here.

Referring to FIG. 13, the localization apparatus generates a first image, for example, a distance field map, from an input image by estimating an object in operation 1330, prior to scoring a degree of visual alignment in operation 1360. The localization apparatus calculates scores between the first image and the second candidate images.

FIG. 14 is a diagram illustrating an example of a localization apparatus. Referring to FIG. 14, a localization apparatus 1400 includes sensors 1410 and a processor 1430. The localization apparatus 1400 further includes a memory 1450, a communication interface 1470, and a display device 1490. The sensors 1410, the processor 1430, the memory 1450, the communication interface 1470, and the display device 1490 are connected to each other through a communication bus 1405.

The sensors 1410 may include, for example, an image sensor, a vision sensor, an accelerometer sensor, a gyro sensor, a GPS sensor, an IMU sensor, a radar, and a lidar. The sensor(s) 1410 may acquire an input image that includes a driving image of a vehicle. The sensor(s) 1410 may sense sensing information, for example, an acceleration, a driving direction, a handle steering angle of the vehicle, and a speed of the vehicle, in addition to localization information, for example, GPS coordinates, a location, and a pose of the vehicle.

In an example, the processor 1430 generates a first image that includes a directional characteristic corresponding to an object included in the input image. In an example, the processor 1430 generates a second image in which the object is projected based on localization information, based on map data that includes the location of the object. In an example, the processor 1430 modifies the localization information based on visual alignment between the first image and the second image.

In an example, the localization apparatus 1400 acquires a variety of sensing information including the input image from various sensors through the communication interface 1470. In one example, the communication interface 1470 receives sensing information including a driving image from other sensors outside the localization apparatus 1400.

The processor 1430 provides an AR service by outputting the modified localization information through the communication interface 1470 and/or the display device 1490, or by displaying a virtual object and the input image on map data based on the modified localization information. Also, the processor 1430 may perform the one or more methods described with reference to FIGS. 1 to 13 or an algorithm corresponding thereto.

The processor 1430 refers to a data processing device configured as hardware with circuitry in a physical structure to execute desired operations. For example, the desired operations may include code or instructions included in a program. For example, the data processing device configured as hardware may include a microprocessor, a central processing unit (CPU), a processor core, a multicore processor, a multiprocessor, an application-specific integrated circuit (ASIC), and a field programmable gate array (FPGA). Further details on the processor 1430 are provided below.

The processor 1430 executes the program and controls the localization apparatus 1400. The program code executed by the processor 1430 may be stored in the memory 1450.

The memory 1450 stores the localization information of the localization apparatus 1400, the first image, the second image, and/or the modified localization information. The memory 1450 stores a variety of information that is generated during the processing performed by the processor 1430. In an example, the memory 1450 stores the map data. In addition, the memory 1450 stores a variety of data and programs. The memory 1450 may include, for example, a volatile memory or a non-volatile memory. The memory 1450 may include a mass storage medium, such as a hard disk, to store a variety of data. Further details on the memory 1450 are provided below.

The display device 1490 outputs the localization information modified by the processor 1430, or displays a virtual object with the input image on map data based on the modified localization information. In an example, the display device 1490 is a physical structure that includes one or more hardware components that provide the ability to render a user interface and/or receive user input. In an example, the localization information, or the virtual object with the input image on map data based on the modified localization information, is displayed on a windshield glass or a separate screen of the vehicle using a head-up display (HUD), or is displayed on an augmented reality head-up display (AR HUD). In an example, the localization apparatus 1400 transmits the localization information to an electronic control unit (ECU) or a vehicle control unit (VCU) of a vehicle. The ECU or the VCU displays the localization information on the display device 1490 of the vehicle.

However, the displaying of the object is not limited to the example described above, and any other instrument cluster, vehicular infotainment system, screen in the vehicle, or display panel in the vehicle may perform the display function. Other displays, such as, for example, a smartphone and an eye glass display (EGD), that are operatively connected to the localization apparatus 1400 may be used without departing from the spirit and scope of the illustrative examples described.

In one example, the localization apparatus may perform the localization method independent of a viewpoint by updating 3D localization information of the localization apparatus using a result of performing the localization method based on a photographing apparatus, even when a viewpoint between the photographing device and the localization apparatus does not match, such as, for example, with an HUD and AR glasses. Also, when the viewpoint between the photographing device and the localization apparatus matches, such as, for example, with a mobile terminal and a smartphone, the localization apparatus may update 3D localization information and, additionally, may directly use a 2D location in an image for modification.

The localization apparatus, processor 1430, and other apparatuses, units, modules, devices, and other components described herein are implemented by hardware components. Examples of hardware components that may be used to perform the operations described in this application where appropriate include controllers, sensors, generators, drivers, memories, comparators, arithmetic logic units, adders, subtractors, multipliers, dividers, integrators, and any other electronic components configured to perform the operations described in this application. In other examples, one or more of the hardware components that perform the operations described in this application are implemented by computing hardware, for example, by one or more processors or computers. A processor or computer may be implemented by one or more processing elements, such as an array of logic gates, a controller and an arithmetic logic unit, a digital signal processor, a microcomputer, a programmable logic controller, a field-programmable gate array, a programmable logic array, a microprocessor, or any other device or combination of devices that is configured to respond to and execute instructions in a defined manner to achieve a desired result. In one example, a processor or computer includes, or is connected to, one or more memories storing instructions or software that are executed by the processor or computer. Hardware components implemented by a processor or computer may execute instructions or software, such as an operating system (OS) and one or more software applications that run on the OS, to perform the operations described in this application. The hardware components may also access, manipulate, process, create, and store data in response to execution of the instructions or software. For simplicity, the singular term “processor” or “computer” may be used in the description of the examples described in this application, but in other examples multiple processors or computers may be used, or a processor or computer may include multiple processing elements, or multiple types of processing elements, or both. For example, a single hardware component or two or more hardware components may be implemented by a single processor, or two or more processors, or a processor and a controller. One or more hardware components may be implemented by one or more processors, or a processor and a controller, and one or more other hardware components may be implemented by one or more other processors, or another processor and another controller. One or more processors, or a processor and a controller, may implement a single hardware component, or two or more hardware components. A hardware component may have any one or more of different processing configurations, examples of which include a single processor, independent processors, parallel processors, single-instruction single-data (SISD) multiprocessing, single-instruction multiple-data (SIMD) multiprocessing, multiple-instruction single-data (MISD) multiprocessing, and multiple-instruction multiple-data (MIMD) multiprocessing.

The methods that perform the operations described in this application are performed by computing hardware, for example, by one or more processors or computers, implemented as described above executing instructions or software to perform the operations described in this application that are performed by the methods. For example, a single operation or two or more operations may be performed by a single processor, or two or more processors, or a processor and a controller. One or more operations may be performed by one or more processors, or a processor and a controller, and one or more other operations may be performed by one or more other processors, or another processor and another controller. One or more processors, or a processor and a controller, may perform a single operation, or two or more operations.

Instructions or software to control a processor or computer to implement the hardware components and perform the methods as described above are written as computer programs, code segments, instructions, or any combination thereof, for individually or collectively instructing or configuring the processor or computer to operate as a machine or special-purpose computer to perform the operations performed by the hardware components and the methods as described above. In an example, the instructions or software include at least one of an applet, a dynamic link library (DLL), middleware, firmware, a device driver, or an application program storing the localization method described above. In one example, the instructions or software include machine code that is directly executed by the processor or computer, such as machine code produced by a compiler. In another example, the instructions or software include higher-level code that is executed by the processor or computer using an interpreter. Programmers of ordinary skill in the art can readily write the instructions or software based on the block diagrams and the flow charts illustrated in the drawings and the corresponding descriptions in the specification, which disclose algorithms for performing the operations performed by the hardware components and the methods as described above.

The instructions or software to control computing hardware, for example, one or more processors or computers, to implement the hardware components and perform the methods as described above, and any associated data, data files, and data structures, may be recorded, stored, or fixed in or on one or more non-transitory computer-readable storage media. Examples of a non-transitory computer-readable storage medium include read-only memory (ROM), random-access memory (RAM), CD-ROMs, CD-Rs, CD+Rs, CD-RWs, CD+RWs, DVD-ROMs, DVD-Rs, DVD+Rs, DVD-RWs, DVD+RWs, DVD-RAMs, BD-ROMs, BD-Rs, BD-R LTHs, BD-REs, magnetic tapes, floppy disks, magneto-optical data storage devices, optical data storage devices, hard disks, solid-state disks, and any other device that is configured to store the instructions or software and any associated data, data files, and data structures in a non-transitory manner and provide the instructions or software and any associated data, data files, and data structures to one or more processors or computers so that the one or more processors or computers can execute the instructions. In one example, the instructions or software and any associated data, data files, and data structures are distributed over network-coupled computer systems so that the instructions and software and any associated data, data files, and data structures are stored, accessed, and executed in a distributed fashion by the one or more processors or computers.

While this disclosure includes specific examples, it will be apparent after an understanding of the disclosure of this application that various changes in form and details may be made in these examples without departing from the spirit and scope of the claims and their equivalents. The examples described herein are to be considered in a descriptive sense only, and not for purposes of limitation. Descriptions of features or aspects in each example are to be considered as being applicable to similar features or aspects in other examples. Suitable results may be achieved if the described techniques are performed in a different order, and/or if components in a described system, architecture, device, or circuit are combined in a different manner, and/or replaced or supplemented by other components or their equivalents. Therefore, the scope of the disclosure is defined not by the detailed description, but by the claims and their equivalents, and all variations within the scope of the claims and their equivalents are to be construed as being included in the disclosure.

What is claimed is:
1. A localization method comprising: generating a first image with a directional characteristic corresponding to a first object included in an input image, wherein pixel values in the first image indicate respective degrees of closeness to seed pixels of the first object; generating a second image to which a second object included in three-dimensional (3D) map data is projected dependent on acquired localization information of a device, wherein the second object is of a same object type as the first object; and updating the localization information based on a degree of visual alignment between the first image and the second image, so as to increase the degree of visual alignment between the first image and the second image, wherein the degree of visual alignment is dependent on a pooling of respective values of pixels, of the first image, corresponding to pixels of the second object of the second image.
2. The localization method of claim 1, wherein the acquired localization information comprises a location of the device and a pose of the device.
3. The localization method of claim 2, wherein the generating of the second image comprises: adjusting a pose of a virtual camera, corresponding to the location on the 3D map data, based on the pose of the device; and generating an image of a viewpoint at which the second object is viewed from the virtual camera.
4. The localization method of claim 1, wherein the directional characteristic corresponds to a probability distribution indicating the respective degrees of closeness.
5. The localization method of claim 1, further comprising: determining a virtual object on the 3D map data for an augmented reality (AR) service; and displaying the virtual object and the input image based on the updated localization information.
6. The localization method of claim 5, wherein the virtual object represents driving route information.
7. The localization method of claim 1, wherein the generating of the first image comprises generating, using a trained neural network, a probability map that represents the directional characteristic.
8. The localization method of claim 7, wherein each pixel in the probability map is configured to store a distance from the each pixel to a corresponding closest seed pixel of the seed pixels.
9. The localization method of claim 8, wherein the seed pixels are pixels corresponding to the first object among pixels included in the input image.
10. The localization method of claim 1, wherein the generating of the second image comprises generating the second image using a transformer configured to transform a coordinate system of the 3D map data to a coordinate system of the second image.
11. The localization method of claim 1, wherein the first image is configured to classify the first object based on an object type and to store a directional characteristic for each object type, and the second image is configured to classify the second object based on the object type and to store the projected object for the each object type.
12. A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the localization method of claim 1.
13. A localization apparatus comprising: a processor configured to: generate a first image with a directional characteristic corresponding to a first object included in an input image, wherein pixel values in the first image indicate respective degrees of closeness to seed pixels of the first object; generate a second image to which a second object included in three-dimensional (3D) map data is projected dependent on acquired localization information, wherein the second object corresponds to the first object, and the second object is of a same object type as the first object; and adjust the localization information based on a degree of visual alignment between the first image and the second image, so as to increase the degree of visual alignment between the first image and the second image, wherein the degree of visual alignment is calculated dependent on a pooling of respective values of pixels, of the first image, corresponding to pixels of the second object of the second image.
14. The localization apparatus of claim 13, wherein the directional characteristic corresponds to a probability distribution indicating the respective degrees of closeness.
15. The localization apparatus of claim 13, wherein the processor is further configured to determine a virtual object on the 3D map data for an augmented reality (AR) service, and synthesize the virtual object and the input image based on the adjusted localization information.
16. The localization apparatus of claim 15, further comprising a display, wherein, for the synthesizing of the virtual object and the input image, the processor is configured to control the display to display a result of the synthesizing.
17. The localization apparatus of claim 15, wherein the virtual object represents driving route information.
18. The localization apparatus of claim 13, wherein the processor is further configured to generate, using a trained neural network, a probability map that represents the directional characteristic.
19. The localization apparatus of claim 18, wherein each pixel in the probability map is configured to store a distance from the each pixel to a corresponding closest seed pixel of the seed pixels.
20. The localization apparatus of claim 19, wherein the seed pixels are pixels corresponding to the first object among pixels included in the input image.
21. The localization apparatus of claim 13, wherein the first image is configured to classify the first object based on an object type and to store a directional characteristic for each object type, and the second image is configured to classify the second object based on the object type and to store the projected object for the each object type.
22. A device comprising: a processor configured to: generate a first image with a directional characteristic corresponding to a first object included in a captured image, wherein pixel values in the first image indicate respective degrees of closeness to seed pixels of the first object; and adjust acquired localization information of the device to reflect a first degree of visual alignment, between the first image and first image information of a first projection of a second object dependent on the acquired localization information of the device, being less than a second degree of visual alignment between the first image and a second projection of the second object dependent on the adjusted localization information, wherein the first degree of visual alignment is dependent on a pooling of values of pixels of the first image corresponding to pixels of the first image information for the second object, and wherein the second object is a same object type as the first object, and the first projection of the second object dependent on the acquired localization information of the device is a projection of the second object as included in three-dimensional (3D) map data.
23. The device of claim 22, wherein the acquired localization information of the device reflects a location of the device and a pose of the device, and for the first projection of the second object the processor is configured to: adjust a pose of a virtual camera, corresponding to the location on the 3D map data, dependent on the pose of the device; and generate a second image, including the first image information, to which the second object included in the three-dimensional (3D) map data is projected dependent on the acquired localization information of the device, by generating an image of a viewpoint at which the second object is viewed from the virtual camera.
24. The device of claim 23, wherein the first degree of visual alignment is between the first image and the generated second image.
25. The device of claim 22, further comprising a display, wherein the processor is configured to: determine a virtual object on the 3D map data for an augmented reality (AR) provision, and control the display to display the virtual object based on the adjusted localization information.
26. The device of claim 25, wherein the display is a glass display.
27. The device of claim 25, further comprising: a camera, wherein the captured image is captured by the camera; and a localization information sensor, including any one or any combination of any two or more of an accelerometer sensor, a gyro sensor, a GPS, an inertial measurement unit (IMU) sensor, a radar, and a lidar, wherein the acquired localization information of the device is acquired using the localization information sensor.
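As an editorial illustration only, and not a statement of the claimed implementation, the sketch below shows one way the distance-based probability map recited in claims 7 to 9 and 18 to 20, and the pooling-style alignment score recited in claims 1, 13, and 22, could be realized. The helper names closeness_map and alignment_score, the decay constant tau, and the use of SciPy's Euclidean distance transform are assumptions made for this sketch.

```python
# Hypothetical sketch: a per-pixel "degree of closeness" map built from distances
# to the nearest seed pixels, and an alignment score obtained by pooling the map's
# values at the pixels covered by the projected object.
import numpy as np
from scipy.ndimage import distance_transform_edt

def closeness_map(seed_mask, tau=5.0):
    """seed_mask is True at seed pixels of the detected object; the returned values
    decrease with distance to the nearest seed pixel."""
    dist = distance_transform_edt(~seed_mask)  # distance from each pixel to its nearest seed
    return np.exp(-dist / tau)                 # one possible distance-to-closeness mapping

def alignment_score(first_image, projected_mask):
    """Pool (here: average) the first-image values at the projected object's pixels."""
    if not projected_mask.any():
        return 0.0
    return float(first_image[projected_mask].mean())

# Toy example: a vertical object detected at column 4; a projection that lands on
# the seeds scores higher than one that is two columns off.
seeds = np.zeros((10, 10), dtype=bool)
seeds[:, 4] = True
first = closeness_map(seeds)
on_target = np.zeros_like(seeds)
on_target[:, 4] = True
off_target = np.zeros_like(seeds)
off_target[:, 6] = True
assert alignment_score(first, on_target) > alignment_score(first, off_target)
```

Under this reading, updating the localization so as to increase the degree of visual alignment amounts to searching for the pose whose projection maximizes such a pooled score.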